-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework Validator Client fallback mechanism #4393
Conversation
2226ed9
to
7e96756
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coming to approve Jimmy's last commit on this.
I've not reviewed this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Nice one, finally 😭 🎉
Will raise a separate issue to revisit the locks. I think #6413 (lockbud) will also help.
@mergify queue |
🛑 Branch protection settings are not validated anymoreBranch protection is enabled and is preventing Mergify to merge the pull request. Mergify will merge when branch protection settings validate the pull request once again. (detail: 1 review requesting changes and 2 approving reviews by reviewers with write access.) |
@mergify requeue |
✅ This pull request will be re-embarked automaticallyThe followup |
✅ The pull request has been merged automaticallyThe pull request has been merged automatically at f870b66 |
* Rework Validator Client fallback mechanism * Add CI workflow for fallback simulator * Tie-break with sync distance for non-synced nodes * Fix simulator * Cleanup unused code * More improvements * Add IsOptimistic enum for readability * Use configurable sync distance tiers * Fix tests * Combine status and health and improve logging * Fix nodes not being marked as available * Fix simulator * Fix tests again * Increase fallback simulator tolerance * Add http api endpoint * Fix todos and tests * Update simulator * Merge branch 'unstable' into vc-fallback * Add suggestions * Add id to ui endpoint * Remove unnecessary clones * Formatting * Merge branch 'unstable' into vc-fallback * Merge branch 'unstable' into vc-fallback * Fix flag tests * Merge branch 'unstable' into vc-fallback * Merge branch 'unstable' into vc-fallback * Fix conflicts * Merge branch 'unstable' into vc-fallback * Remove unnecessary pubs * Simplify `compute_distance_tier` and reduce notifier awaits * Use the more descriptive `user_index` instead of `id` * Combine sync distance tolerance flags into one * Merge branch 'unstable' into vc-fallback * Merge branch 'unstable' into vc-fallback * wip * Use new simulator from unstable * Fix cli text * Remove leftover files * Remove old commented code * Merge branch 'unstable' into vc-fallback * Update cli text * Silence candidate errors when pre-genesis * Merge branch 'unstable' into vc-fallback * Merge branch 'unstable' into vc-fallback * Retry on failure * Merge branch 'unstable' into vc-fallback * Merge branch 'unstable' into vc-fallback * Remove disable_run_on_all * Remove unused error variant * Fix out of date comment * Merge branch 'unstable' into vc-fallback * Remove unnecessary as_u64 * Remove more out of date comments * Use tokio RwLock and remove parking_lot * Merge branch 'unstable' into vc-fallback * Formatting * Ensure nodes are still added to total when not available * Allow VC to detect when BN comes online * Fix ui endpoint * Don't have block_service as an Option * Merge branch 'unstable' into vc-fallback * Clean up lifetimes and futures * Revert "Don't have block_service as an Option" This reverts commit b5445a0. * Merge branch 'unstable' into vc-fallback * Merge branch 'unstable' into vc-fallback * Improve rwlock sanitation using clones * Merge branch 'unstable' into vc-fallback * Drop read lock immediately by cloning the vec.
Issue Addressed
#3613
Proposed Changes
Each connected BN is queried every slot by the VC and is then sorted in a priority list. When attempting to perform duties, the VC will use the healthiest BN first (within a configurable tolerance).
Additional Info
RequireSynced
enum. This was scarcely used but may have implications when removed.OfflineOnFailure
enum. This was also scarcely used. In the new system, beacon nodes are never marked as offline so in essence haveOfflineOnFailure::No
by default.Remaining Work
CandidateBeaconNode
or remove it entirely./lighthouse/ui
endpoint which exposes the fallback health stats forsiren
.simulator
to test fallback mechanism.